Range-limited Centrality Measures in Complex Networks



Maria Ercsey-Ravasz, Ryan Lichtenwalter, Nitesh V. Chawla, and Zoltan Toroczkai

Faculty of Physics, Babeş-Bolyai University, Str. Kogalniceanu Nr. 1, RO-400084 Cluj-Napoca, Romania
Interdisciplinary Center for Network Science and Applications (iCeNSA),
University of Notre Dame, Notre Dame, IN, 46556 USA

Department of Computer Science and Engineering,
University of Notre Dame, Notre Dame, IN, 46556 USA
Department of Physics, University of Notre Dame, Notre Dame, IN, 46556 USA

(Dated: November 24, 2011) 

Here we present a range-limited approach to centrality measures in both non-weighted and 
weighted directed complex networks. We introduce an efficient method that generates for ev- 
ery node and every edge its betweenness centrality based on shortest paths of lengths not longer 
than $l = 1, \ldots, L$ in case of non-weighted networks, and for weighted networks the corresponding
quantities based on minimum weight paths with path weights not larger than $w_l = l\Delta$,
$l = 1, 2, \ldots, L = R/\Delta$. These measures provide a systematic description on the positioning
importance of a node (edge) with respect to its network neighborhoods 1-step out, 2-steps out, etc. up
to including the whole network. They are more informative than traditional centrality measures, as 
network transport typically happens on all length-scales, from transport to nearest neighbors to the 
farthest reaches of the network. We show that range-limited centralities obey universal scaling laws 
for large non-weighted networks. As the computation of traditional centrality measures is costly,
this scaling behavior can be exploited to efficiently estimate centralities of nodes and edges for all 
ranges, including the traditional ones. The scaling behavior can also be exploited to show that the 
ranking top-list of nodes (edges) based on their range-limited centralities quickly freezes as function 
of the range, and hence the diameter-range top-list can be efficiently predicted. We also show how 
to estimate the typical largest node-to-node distance for a network of N nodes, exploiting the afore- 
mentioned scaling behavior. These observations are illustrated on model networks and on a large 
social network inferred from cell-phone trace logs ($\approx 5.5 \times 10^6$ nodes and $\approx 2.7 \times 10^7$ edges).
Finally, we apply these concepts to efficiently detect the vulnerability backbone of a network (defined
as the smallest percolating cluster of the highest betweenness nodes and edges) and illustrate the 
importance of weight-based centrality measures in weighted networks in detecting such backbones. 

PACS numbers: 89.75.Hc, 89.65.-s, 02.10.Ox 



I. INTRODUCTION 



Network research [1-5] has experienced an explosive
growth in the last two decades, as it has proven itself 
to be an informative and useful methodology to study 
complex systems, ranging from social sciences through 
biology to communication infrastructures. Both the natural
and man-made worlds abound with networked
structures that transport various entities, such as information,
forces, energy, material goods, etc. As many of
these networks are the result of evolutionary processes, 
it is important to understand how the graph structure of 
these systems determines their transport performance, 
structural stability and behavior as a whole. A rather 
useful concept in addressing such questions is the no- 
tion of centrality, which describes the positioning "im- 
portance" of a structure of interest such as a node, edge 
or subgraph with respect to the whole network. Although 
the notion of centrality in graph theory dates back to the 
mathematician Camille Jordan (1869), centrality measures
were expanded, refined and applied to a great extent
for the first time in social sciences [6-10], and today
they play a fundamental role in studies involving a large 
variety of complex networks across many fields. Probably 
the most frequently used centrality measure is between- 
ness centrality (BC) p^OUTB] . introduced by Anthonisse 
[11] and Freeman [12], defined as the fraction of all network
geodesics (shortest paths) passing through a node
(edge or subgraph). Since transport tends to minimize
the cost/time of the route from source to destination, it
is expected to occur along geodesics, and therefore centrality
measures are typically defined as a function of
these; however, generalizations to arbitrary distributions
of transport paths have also been introduced and studied
[17, 18]. Geodesics are important for structural connectivity
as well: removing nodes (edges) with high BC, one
obtains a rapid increase in diameter, and eventually the
structural breakup of the graph.

In general, centrality measures are defined in the context
of assumptions (sometimes made implicitly) regarding
the type of network flow [16]. These are assumptions
regarding the nature of the paths, such as being
shortest or arbitrary-length paths, weighted/valued
paths, or walks (with repeated nodes and edges) [19]; and
the nature of the flow, such as transport of indivisible
units (packets), or spreading/broadcasting processes
(infection, information). Besides betweenness centrality,
many other centrality measures have been introduced
[10], depending on the context in which network flows
are considered; for a partial compilation see the paper
by Brandes [14]. Here we only review a limited list. In
particular, stress centrality [20-22] simply counts the
number of all-pair shortest paths passing through a node
(edge) without taking into account the degeneracy of the
geodesics (there can be several geodesics running between
the same pair of nodes). Closeness centrality jS) \T3\ [23 ] 
and its variants are simple functions of the mean geodesic 
distance (hop-count) of a node from all other nodes. Load 
centrality [HI [24l [25] is generated by the total amount of 
load passing through a node when unit commodities are 
passed between all source-destination pairs using an algo- 
rithm in which the commodity packet is equally divided 
amongst the neighbors of a node that are at the same 
geodesic distance from the destination. Group betweenness
centrality [26, 27] computes the betweenness associated
with a set of nodes, restricted to all-pair geodesics
that traverse at least one of the nodes in the group. Ego
network betweenness [28] is a local betweenness measure
computed only from the immediate neighborhood of a
node (ego). Eigenvector centrality [29, 30] represents a
positive score associated to a node, proportional to the
sum of the scores of the node's neighbors, solved consistently
across the graph. The corresponding score vector
is the eigenvector associated with the largest eigenvalue
of the adjacency matrix. Random walk centrality [31, 32]
is a measure of the accessibility of a node via random
walks in the network. Other centrality measures include
information centrality of Stephenson and Zelen [33], and
induced endogenous and exogenous centrality by Everett
and Borgatti [34].

Bounded-distance betweenness was introduced by Borgatti
and Everett [10] as betweenness centrality resulting
from all-pair shortest paths not longer than a given
length (hop-count). It is this measure that we expand 
and investigate in detail in the present paper. A con- 
densed version for unweighted networks has been pre- 
sented in Ref. [35]. Since we are also generalizing the 
measure and the corresponding algorithm to weighted 
(valued) networks, we are referring to it as range-limited 
centrality. Note that range-limitation can be imposed on 
all centrality measures that depend on paths, and there- 
fore the analysis and algorithm presented here can be 
extended to all these centrality measures. 

Centrality measures have received numerous applica- 
tions in several areas. In social sciences they have been 
extensively used to quantify the position of individuals 
with respect to the rest of the network in various social
network data sets [6, 16]. In physics and computer
science they have seen widespread applications related 
to routing algorithms in packet switched communication 
networks and transport problems in general [iTl IMl l36] - 
[41]. The connection of generalized betweenness centrality
based on arbitrary path distributions (not just shortest)
to routing that minimizes congestion has been investigated
by Sreenivasan et al. [17] using minimum sparsity
vertex separators. This makes a direct connection
to max-flow min-cut theorems of multicommodity flows,
extensively studied in the computer science literature
[42, 43]. Other works that use essentially edge-betweenness-type
quantities to quantify congestion in Internet-like
graphs include Refs. [44, 45]. Dall'Asta et al. connect
node and edge detection probabilities in traceroute-based
sampling of networks to their betweenness centrality values
[46, 47]. Other applications include detection of network
vulnerabilities in the face of attacks [48], cascading failures
[49-51] or epidemics [52], all involving betweenness-related
calculations.

An important extension of centrality is to weighted,
or valued, networks [25, 53-57]. In this case the edges
(and also the nodes) carry an associated weight, which
may represent a measure of social relationship in social
networks [58], channel capacity in the case of communication
networks, transport capacity (e.g., number of lanes) in
roadway networks or seats on flights [59].

From a theoretical point of view, there have been fewer
results, as producing analytic expressions for centralities
in networks is difficult in general. However, for scale-free
trees, Szabo et al. [60] developed a mean-field approach
for computing node betweenness, which later was made
rigorous by Bollobas and Riordan [61]. Fekete et al. provide
a calculation of the distribution of edge betweenness
on scale-free trees conditional on node in-degrees [62],
and Kitsak et al. [63] have derived scaling results on
betweenness centrality for fractal and non-fractal scale-free
networks.

Unfortunately, computation of betweenness can be
costly ($O(NM)$, where $N$ is the number of nodes and
$M$ is the number of edges, thus $O(N^3)$ worst case)
[14, 25, 54, 64-66], especially for large networks with millions
of nodes; hence approximation methods are needed.
Existing approximations [32, 67, 68], however, are sampling
based, and not well controlled. Additionally, transport
in real networks does not occur with uniform probability
between arbitrary pairs of nodes, as transport incurs
a cost, and therefore shorter-range transport is expected
to be more frequent than long-range transport. Accordingly,
the usage of network paths is non-uniform, which should
be taken into account if we want to connect centrality
properties with real transport. In order to address some
of the limitations of existing centrality measures, we recently
focused on range-limited centrality [35]. We have
shown that when geodesics are restricted to a maximum
length $L$, the corresponding range-limited $L$-betweenness
for large graphs assumes a characteristic scaling form as
function of $L$. This scaling can then be used to predict
the betweenness distribution in the (difficult to attain)
diameter limit, and with good approximation, to
predict the ranking of nodes/edges by betweenness, saving
considerable computational costs. Additionally, the
range-limited method generates $l$-betweenness values for
all nodes and edges and for all $1 \le l \le L$, providing
systematic information on geodesics on all length-scales.

FIG. 1: a) Consecutive shells of the $C_3$ subgraph of node $i$ (black) are colored red, blue, green. Grey elements are not part of
the subgraph. b) The (x,y,z) near a node $j$ are the $b_l^r(i|j)$ values for $l = 1$, $l = 2$ and $l = 3$. The [x,y,z] on an edge $(j,k)$
give the $b_l^r(i|j,k)$ values for $l = 1$, $l = 2$ and $l = 3$. Given a node $j$, the number inside its circle is the total number of shortest
paths $\sigma_{ij}$ to $j$ from $i$. Colors indicate quantities based on $l = 1$ (red), $l = 2$ (blue), and $l = 3$ (green).

In this paper we give a detailed derivation of the algorithm
and the analytical approximations presented in
[35], and we demonstrate the efficiency of the method on
a social network (SocNet) inferred from mobile phone
trace-logs [69]. This network has a giant cluster with
$N = 5,568,785$ nodes and $M = 26,822,764$ directed
edges. The diameter of the underlying undirected network
is approximately $D \simeq 26$ and the calculation of
the traditional (diameter-range based) BC values (using
Brandes' algorithm) on this network took 5 days on 562
computers.

In addition, we present the derivations for an algo- 
rithm that efficiently computes range-limited centralities 
on weighted networks. We then apply these concepts and 
algorithms to the network vulnerability backbone detec- 
tion problem, and show the differences between the back- 
bones obtained with both hop-count based centralities 
and weighted centralities. 

The paper is organized as follows. Section II introduces
the notations and provides the algorithm for unweighted
graphs; Section III gives an analytical treatment
that derives the existence of a scaling behavior for centrality
measures in large graphs, gives a method to
estimate the largest typical node-to-node distance (a
lower bound to the diameter), and discusses the complexity
of the algorithm and the fast freezing phenomenon of
ranking by betweenness of nodes and edges. Section IV
illustrates the power of the range-limited approach (by
showing how well one can predict betweenness centralities
and ranking of individual nodes and edges) using the
social-network data described above. Section V describes
the algorithm for weighted graphs and Section VI uses the
range-limited betweenness measure to define a vulnerability
backbone for networks and illustrates the differences
in identification of the backbone obtained with and without
weights on the links.

II. RANGE-LIMITED CENTRALITY FOR
NON-WEIGHTED GRAPHS

A. Definitions and notations 

Let us consider a directed simple graph $G(V, E)$, which
consists of a set $V$ of vertices (or nodes) and a set
$E \subseteq V \times V$ of directed edges (or links). We will denote by
$(v_i, v_j) \in E$ an edge directed from node $v_i \in V$ to node
$v_j \in V$. The graph has $N$ nodes and $M \le N(N-1)$
edges. The algorithm below can easily be modified for
undirected graphs; we will not treat that case separately.
A directed path $\omega_{mn}$ from some node $m$ to a node $n$
is defined as an ordered sequence of nodes and links
$\omega_{mn} = \{m, (m, v_1), v_1, (v_1, v_2), v_2, \ldots, v_l, (v_l, n), n\}$ without
repeated nodes. The "distance" $d(m, n)$ is the length
of the shortest directed path going from node $m$ to node
$n$. We give a definition of distance (path weight) for
weighted networks in Section V. In non-weighted networks
the directed path length is simply the number of
edges ("hop-count") along the directed path from $m$ to
$n$. There can be multiple shortest paths (same length),
and we will denote by $\sigma_{mn}$ the total number of shortest
directed paths from node $m$ to $n$. $\sigma_{mn}(i)$ will represent
the number of shortest paths from node $m$ to node $n$
going through node $i$. As convention we set

$$\sigma_{mn}(m) = \sigma_{mn}(n) = \sigma_{mn}, \qquad \sigma_{mm}(i) = \delta_{i,m}. \tag{1}$$

The total number of all-pair shortest paths running
through a node $i$ is called the stress centrality (SC) of
node $i$, $S(i) = \sum_{m,n \in V} \sigma_{mn}(i)$. Betweenness centrality
(BC) [Mini [HI [B] normalizes the number of paths
through a node by the total number of paths ($\sigma_{mn}$) for
a given source-destination pair $(m, n)$:

$$B(i) = \sum_{m,n \in V} \frac{\sigma_{mn}(i)}{\sigma_{mn}}. \tag{2}$$

Similar quantities can be defined for an edge $(j, k) \in E$:

$$B(j,k) = \sum_{m,n \in V} \frac{\sigma_{mn}(j,k)}{\sigma_{mn}}. \tag{3}$$
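
As a sanity check on definitions (1)-(3), the node betweenness of Eq. (2) can be evaluated by brute force on a small graph. The sketch below is only illustrative and is not the algorithm of this paper: the function names `bfs_counts` and `betweenness` are hypothetical, the graph is assumed to be given as a dictionary of out-neighbor lists, and the sum is restricted to pairs $m \neq n$ (our assumption). It uses the fact that $\sigma_{mn}(i) = \sigma_{mi}\sigma_{in}$ whenever $d(m,i) + d(i,n) = d(m,n)$, which also reproduces convention (1) for $i = m$ and $i = n$.

```python
from collections import deque

def bfs_counts(adj, s):
    """Hop distances d(s, v) and shortest-path counts sigma_sv from source s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Brute-force Eq. (2), end-points included, pairs with m == n skipped."""
    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}
    info = {s: bfs_counts(adj, s) for s in nodes}
    B = {i: 0.0 for i in nodes}
    for m in nodes:
        dm, sm = info[m]
        for n in dm:
            if n == m:
                continue
            for i in nodes:
                di, si = info[i]
                # i lies on a shortest m -> n path iff the distances add up
                if i in dm and n in di and dm[i] + di[n] == dm[n]:
                    B[i] += sm[i] * si[n] / sm[n]
    return B
```

For the four-node "diamond" `{0: [1, 2], 1: [3], 2: [3], 3: []}`, node 1 carries the pairs (0,1) and (1,3) fully and half of the pair (0,3), so `betweenness` returns 2.5 for it.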

In order to define range-limited betweenness centralities,
let $b_l(j)$ denote the BC of a node $j$ for all-pair shortest
directed paths of fixed, exact length $l$. Then

$$B_L(j) = \sum_{l=1}^{L} b_l(j) \tag{4}$$

represents the betweenness centrality obtained from
paths not longer than $L$. For edges, we introduce $b_l(j, k)$
and $B_L(j, k)$ using the same definitions. For simplicity,
here we include the start- and end-points of the paths
in the centrality measures; however, our algorithm can
easily be changed to exclude them, as described later.

Similar to other algorithms, our method first calculates
these BCs for a node $j$ (or edge $(j, k)$) from shortest directed
paths all emanating from a "root" node $i$, then it
sums the obtained values for all $i \in V$ to get the final
centralities for node $j$ (or edge $(j, k)$). This can be done
because the set of all shortest paths can be uniquely decomposed
into subsets of shortest paths distinguished by
their starting node. Thus it makes sense to perform a
shell decomposition of the graph around a root node $i$.
Let us denote by $C_L(i)$ the $L$-range subgraph of node $i$
containing all nodes which can be reached in at most $L$
steps from $i$ (Fig. 1a). Only links which are part of the
shortest paths starting from the root $i$ to these nodes are
included in $C_L$. We decompose $C_L$ into shells $G_l(i)$ containing
all the nodes at shortest path distance $l$ from the
root, and all incoming edges from shell $l - 1$ (Fig. 1b).
The root $i$ itself is considered to be shell 0 ($G_0(i) = \{i\}$).
Let

$$b_l^r(i|k) = \sum_{n \in G_l(i)} \frac{\sigma_{in}(k)}{\sigma_{in}}, \qquad b_l^r(i|j,k) = \sum_{n \in G_l(i)} \frac{\sigma_{in}(j,k)}{\sigma_{in}} \tag{5}$$

denote the fixed-$l$ betweenness centrality of node $k$, and
edge $(j, k)$, respectively, based only on shortest paths
all starting from the root $i$. Here $r$ is not an independent
variable: given $i$ and $k$ (or $(j, k)$), $r$ is the radius of
shell $G_r(i)$ containing $k$ (or $(j, k)$), that is $k \in G_r(i)$ and
$(j, k) \in G_r(i)$. Note that $\sigma_{in}(k) = 0$ (or $\sigma_{in}(j,k) = 0$) if
$k$ (or $(j, k)$) does not belong to at least one shortest path
from $i$ to $n$, and thus there is no contribution from those
points $n$ of the $l$-th shell. The condition for $k$ (or $(j, k)$)
to belong to at least one shortest path from $i$ to $n$ can alternatively
be written in the case of (5) as $d(k, n) = l - r$,
a notation which we will use later.

For simplicity of writing, we refer to the fixed-$l$ betweenness
centralities (the $b_l$-s) as "$l$-BCs" and to the
cumulative betweenness centralities (the $B_L$-s obtained
from summing the $l$-BCs, see (4)) as [$L$]-BCs.

B. The range-limited betweenness centrality
algorithm

While the basics of our algorithm are similar to Brandes'
[14, 55], we derive recursions that simultaneously
compute the [$l$]-BCs for all nodes and all edges and for
all values $l = 1, \ldots, L$. The algorithm thus generates detailed
and systematic information (an $L$-component vector
for every node and every edge) about shortest paths
on all length-scales, and thus provides a tool for multiscale
network analysis.

First we give the algorithm, then we derive the specific
recursions used in it. For the root node $i$ we set the
initial condition $\sigma_{ii} = 1$. For other nodes, $k \neq i$, we
set $\sigma_{ik} = 0$. The following steps are repeated for every
$l = 1, \ldots, L$:

1. Build $G_l(i)$, using breadth-first search.

2. Calculate $\sigma_{ik}$ for all nodes $k \in G_l(i)$, using:

$$\sigma_{ik} = \sum_{j \in G_{l-1}(i),\ (j,k) \in G_l(i)} \sigma_{ij} \tag{6}$$

and set

$$b_l^l(i|k) = 1. \tag{7}$$

3. Proceeding backwards, through $r = l - 1, \ldots, 1, 0$:

a) Calculate the $l$-BCs of links $(j, k) \in G_{r+1}(i)$
(thus $j \in G_r(i)$, $k \in G_{r+1}(i)$) recursively:

$$b_l^{r+1}(i|j,k) = b_l^{r+1}(i|k)\,\frac{\sigma_{ij}}{\sigma_{ik}}, \tag{8}$$

b) and of nodes $j \in G_r(i)$ using (8) and:

$$b_l^r(i|j) = \sum_{k:\ (j,k) \in G_{r+1}(i)} b_l^{r+1}(i|j,k). \tag{9}$$

4. Finally, return to step 1) until the last shell $G_L(i)$
is reached.
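
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `range_limited_bc`, the dictionary-of-out-neighbors graph format, and the list layout of the results are all hypothetical choices, and the graph is assumed to be simple (no parallel edges). Start- and end-points are included, as in the text.

```python
from collections import defaultdict

def range_limited_bc(adj, L):
    """Fixed-l betweenness b_l for l = 1..L, end-points included.

    adj maps each node to an iterable of out-neighbors. Returns
    (node_bc, edge_bc) with node_bc[j][l] and edge_bc[(j, k)][l];
    index 0 of every list is unused.
    """
    node_bc = defaultdict(lambda: [0.0] * (L + 1))
    edge_bc = defaultdict(lambda: [0.0] * (L + 1))
    for i in adj:                          # every node serves once as root
        dist = {i: 0}
        sigma = {i: 1}                     # initial condition sigma_ii = 1
        preds = {i: []}                    # shortest-path predecessors
        shells = [[i]]                     # shells[l] holds the nodes of G_l(i)
        for l in range(1, L + 1):
            shell = []
            for j in shells[l - 1]:        # steps 1-2: BFS layer and Eq. (6)
                for k in adj.get(j, ()):
                    if k not in dist:
                        dist[k], sigma[k], preds[k] = l, 0, []
                        shell.append(k)
                    if dist[k] == l:
                        sigma[k] += sigma[j]
                        preds[k].append(j)
            if not shell:                  # no node at distance l from i
                break
            shells.append(shell)
            b = {k: 1.0 for k in shell}    # Eq. (7)
            for r in range(l - 1, -1, -1): # step 3: backwards sweep
                for k in shells[r + 1]:
                    bk = b.get(k, 0.0)
                    if bk == 0.0:          # k lies on no path to shell l
                        continue
                    for j in preds[k]:
                        e = bk * sigma[j] / sigma[k]   # Eq. (8)
                        edge_bc[(j, k)][l] += e
                        b[j] = b.get(j, 0.0) + e       # Eq. (9)
            for j, value in b.items():     # accumulate over roots
                node_bc[j][l] += value
    return node_bc, edge_bc
```

On the four-node "diamond" `{0: [1, 2], 1: [3], 2: [3], 3: []}` with `L = 2`, node 1 obtains $b_1 = 2$ and $b_2 = 1/2$, so its cumulative value from (4) is 2.5, in agreement with a direct evaluation of the betweenness definition on the same graph.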

In the end, the cumulative [$l$]-BCs, that is the $B_l$-s, can
be calculated using (4). Fig. 1 shows a concrete example.
The subgraph of node $i$ has three layers. Each layer
$G_l(i)$ and the corresponding $l$-BCs are marked with different
colors: $l = 1$ (red), $l = 2$ (blue), and $l = 3$ (green).
As described above, the first step creates the next layer
$G_l(i)$; then in step 2, for every node $k \in G_l(i)$ we calculate
the total number of shortest paths $\sigma_{ik}$ from the
root to node $k$. These are indicated by numbers within
the circles representing the nodes in Fig. 1 (e.g., $\sigma_{ij} = 1$,
$\sigma_{ik} = 2$, $\sigma_{in} = 5$). As given by (6), $\sigma_{ik}$ is calculated by
summing the number of shortest paths that end in the
predecessors of node $k$ located in $G_{l-1}(i)$. For example,
node $p \in G_3(i)$ in Fig. 1 is connected to nodes $k$ and $m$
in shell $G_2(i)$, thus $\sigma_{ip} = \sigma_{ik} + \sigma_{im} = 2 + 1 = 3$.

Eq. (7) states that the $l$-BC of nodes located in $G_l(i)$
is always 1. This follows from Eq. (5) for $r = l$ and using
the convention $\sigma_{ik}(k) = \sigma_{ik}$. Knowing these values, we
proceed backwards (step 3) and calculate the $l$-BCs of
all edges and nodes in all the previous layers. Recursion
(8) is obtained from a well-known recursion for shortest
paths. If $k$ (or $(j, k)$) belongs to at least one shortest path
going from $i$ to $n$, then $\sigma_{in}(k) = \sigma_{ik}\sigma_{kn}$ and $\sigma_{in}(j, k) =
\sigma_{ij}\sigma_{kn}$. Inserting these in Eq. (5) for $r \to r + 1$ we obtain:

$$b_l^{r+1}(i|k) = \sigma_{ik} \sum_{n \in G_l(i),\ d(k,n) = l-r-1} \frac{\sigma_{kn}}{\sigma_{in}}, \tag{10}$$

$$b_l^{r+1}(i|j,k) = \sigma_{ij} \sum_{n \in G_l(i),\ d(k,n) = l-r-1} \frac{\sigma_{kn}}{\sigma_{in}}, \tag{11}$$

where $d(k, n) = l - r - 1$ expresses the condition that
the sum is restricted to those $n$ from $G_l(i)$ which have
at least one shortest path (from $i$) going through $k$ or
$(j, k)$. Dividing these equations we obtain (8). For example,
in Fig. 1, $b_3^3(i|k,n) = b_3^3(i|n)\,\sigma_{ik}/\sigma_{in} = 1 \times (2/5) = 2/5$.

Having determined the $l$-BCs of all edges in layer
$G_{r+1}(i)$, we can now compute the $l$-BC of a given node in
$G_r(i)$ by summing the $l$-BCs of its outgoing links, that is
using (9) (e.g., in Fig. 1, $b_3^2(i|k) = b_3^3(i|k,p) + b_3^3(i|k,n) =
(2/3) + (2/5) = 16/15$).

This algorithm can be easily modified to compute other
centrality measures. For example, to compute all the
range-limited stress centralities, we have to replace Eq.
(7) with $s_l^l(i|j) = \sigma_{ij}$. All other recursions will have
exactly the same form; we just need to replace the $l$-BCs
($b_l^r(i|j)$, $b_l^r(i|j,k)$) with the $l$-SCs ($s_l^r(i|j)$, $s_l^r(i|j,k)$).

If we want to exclude start- and end-points when computing
BCs or SCs, we first let the above algorithm finish,
then we do the following steps: a) set the $l$-BC of the root
node $i$ to 0, $b_l^0(i|i) = 0$ for all $l = 1, \ldots, L$, and b) for every
node $k \in G_l(i)$ reset $b_l^l(i|k) = 0$, for all $l = 1, \ldots, L$
(e.g., in Fig. 1, $k$ is in the second shell, $G_2(i)$, so its 2-BC
will become 0 instead of 1). Then via (4), the [$l$]-BCs
and the corresponding [$l$]-SCs are easily obtained.



III. CENTRALITY SCALING - ANALYTICAL 
APPROXIMATIONS 



In [35] we have shown that the [$l$]-BC obeys a scaling
behavior as function of $l$. This was found to hold
for all sufficiently large random networks that we studied
(Erdos-Renyi (ER), Barabasi-Albert (BA) scale-free,
Random Geometric Graphs (RGG), etc.), including the
social network inferred from mobile phone trace-log data
(SocNet) [70]. Here we detail the analytical arguments
showing that the existence of this scaling behavior
for large networks is a general property, by exploiting
the scaling of shell sizes. The scaling of shell
sizes was already studied previously, e.g., in random
graphs with arbitrary degree distributions [71, 72]. For
simplicity of notation, we only show the derivations
for undirected graphs.



A. Betweenness of individual nodes

Let us define $\langle \cdot \rangle$ as an average over all root nodes $i$ in
the graph, and denote by $z_l(i)$ the number of nodes on
shell $G_l(i)$. We define the branching factor as:

$$\alpha_l = \langle z_{l+1} \rangle / \langle z_l \rangle, \tag{12}$$

and model the growth of shell sizes as a branching process

$$z_{l+1}(i) = z_l(i)\,\alpha_l\,[1 + \epsilon_l(i)]. \tag{13}$$

Here $\epsilon_l(i)$ is a per-node, shell occupancy noise term, encoding
the relative deviations, or fluctuations, from the ($i$-independent)
functional form of $\alpha_l$. Typically, $|\epsilon_l| \ll 1$;
it obeys $\langle \epsilon_l(i) \rangle = 0$ and $\langle \epsilon_l(i)\epsilon_m(j) \rangle = 2A_l\,\delta_{l,m}\,\delta_{i,j}$, with
$A_l$ decreasing with $l$. In undirected graphs, $i \in G_m(j)$
implies $j \in G_m(i)$, and vice versa. Hence, in
this case:

$$b_{l+1}(j) = \frac{1}{2} \sum_{m=0}^{l+1} \sum_{i \in G_m(j)} b_{l+1}^m(i|j). \tag{14}$$

The 1/2 factor comes from the fact that any given path
will be included twice in the sum (once in both directions).
In case of $m = 0$ the only node in $G_0(j)$ is $j$
itself, and the inner sum is equal to $b_{l+1}^0(j|j)$. Due
to convention (1), $\sigma_{jn}(j) = \sigma_{jn}$ and hence from (5) we
obtain $b_{l+1}^0(j|j) = \sum_{n \in G_{l+1}(j)} \sigma_{jn}(j)/\sigma_{jn} = z_{l+1}(j)$. For
$m = l + 1$, $b_{l+1}^{l+1}(i|j) = 1$ (see Eq. (7)) and the inner sum
is again $z_{l+1}(j)$. Thus we can write:

$$b_{l+1}(j) = z_{l+1}(j) + \frac{1}{2} \sum_{m=1}^{l} \sum_{i \in G_m(j)} b_{l+1}^m(i|j). \tag{15}$$

Note that the number of terms in the inner sum
$\sum_{i \in G_m(j)} b_{l+1}^m(i|j)$ is $z_m(j)$, which is rapidly increasing
with $m$, and thus is expected to have a weak dependence
on $j$. Accordingly, we make the approximation:

$$u_{l+1}(j) \equiv \sum_{m=1}^{l} \sum_{i \in G_m(j)} b_{l+1}^m(i|j) \simeq \sum_{m=1}^{l} \sum_{i \in G_m(j)} v_{l+1}^m(i), \tag{16}$$

where we replaced $b_{l+1}^m(i|j)$ by $v_{l+1}^m(i)$, which is an average
$(l + 1)$-BC computed over the shell of radius $m$,
centered on node $i$:

$$v_{l+1}^m(i) = \frac{1}{z_m(i)} \sum_{k \in G_m(i)} b_{l+1}^m(i|k). \tag{17}$$



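However, the sum of $(l + 1)$-BCs in any layer $m < l + 1$
is equal to the number of nodes in shell $G_{l+1}$:
$\sum_{k \in G_m(i)} b_{l+1}^m(i|k) = z_{l+1}(i)$. We can convince ourselves
of this last statement by using (5) and observing
that $\sum_{k \in G_m(i)} \sigma_{in}(k) = \sigma_{in}$, as all paths from $i$ to $n$
($n \in G_{l+1}(i)$) must "pierce" every shell $m < l + 1$ in
between. Fig. 1 shows an example: there are 3 nodes in
$G_3$ and the sum of 3-betweenness values (green) in layer
$G_2$ is $(7/5) + (16/15) + (8/15) = 3$. Therefore, we may
write:

$$v_{l+1}^m(i) = \frac{z_{l+1}(i)}{z_m(i)} = \alpha_l\,\frac{z_l(i)\,[1 + \epsilon_l(i)]}{z_m(i)}, \tag{18}$$

where we used the recursion defined above for $z_{l+1}(i)$ as a
branching process (13). Inserting this in (16) we obtain:

$$u_{l+1}(j) \simeq \alpha_l \sum_{m=1}^{l} \sum_{i \in G_m(j)} \frac{z_l(i)\,[1 + \epsilon_l(i)]}{z_m(i)} \simeq \alpha_l \left[ z_l(j) + \sum_{m=1}^{l-1} \sum_{i \in G_m(j)} \frac{z_l(i)}{z_m(i)} \right], \tag{19}$$

where we neglected the small noise term due to the large
number of terms in the inner sum, and we used the fact
that for $m = l$ the leading term of the inner sum is just
$z_l(j)$. From Eqs. (16) and (18), however, the double sum
in (19) equals $u_l(j)$ and we obtain the following recursion:

$$u_{l+1}(j) \simeq \alpha_l\,[z_l(j) + u_l(j)]. \tag{20}$$

Eqs. (13), (15) and (20) lead to a recursion for $b_{l+1}(j)$:

$$b_{l+1}(j) \simeq \alpha_l\,[b_l(j) + z_l(j)/2 + z_l(j)\,\epsilon_l(j)], \tag{21}$$

which can be iterated down to $l = 1$, where $b_1(j) =
z_1(j) = k_j$ is the degree of $j$:

$$b_l(j) \simeq \beta_l\,k_j\,e^{\xi_l(j)}, \tag{22}$$

with

$$\beta_l = \frac{l+1}{2} \prod_{n=1}^{l-1} \alpha_n = \frac{l+1}{2}\,\frac{\langle z_l \rangle}{\langle k \rangle}, \tag{23}$$

$$\xi_l(j) \simeq \sum_{n=1}^{l-1} \left( 1 - \frac{n}{l+1} \right) \epsilon_n(j). \tag{24}$$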

In many networks, the average shell size $\langle z_l \rangle$ grows exponentially
with the shell "radius" $l$ (e.g., ER, BA,
SocNet), implying a constant average branching factor
larger than one:

$$\alpha_l = \alpha = \frac{\langle z_2 \rangle}{\langle k \rangle} > 1. \tag{25}$$

The exponential growth holds until $l$ reaches the typical
largest shortest path distance $L^*$, beyond which finite-size
effects appear. Accordingly, $\beta_l \sim (l+1)\,\alpha^{l-1}/2$ and $b_l$ grows
exponentially with $l$. In this case, since $b_l$ is rapidly increasing
with $l$, the cumulative $B_L(j) = \sum_{l=1}^{L} b_l(j)$ will
be dominated by $b_L$, and thus $B_L$ obeys the same exponential
scaling as $b_L$, confirmed by numerical simulations
(Fig. 3c in [35] shows this scaling for SocNet).

However, not all large networks have exponentially
growing shell sizes. For example, in spatially embedded
networks without shortcuts such as random geometric
graphs, roadways, etc., the average shell size grows as a
power law $\langle z_l \rangle \sim l^{d-1}$, where $d$ is the embedding dimension
of the metric space. In this case $\beta_l \sim l^d$ and
$b_l(j) \sim l^d$, and $B_L \sim L^{d+1}$. Fig. 3d in [35] shows this
scaling for RGG graphs embedded in d = 2 dimensions.
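
To see how the prefactor in the scaling emerges, one can iterate recursion (21) with the noise term switched off. The short sketch below does this for a prescribed list of branching factors; the function name `iterate_bl` and its arguments are hypothetical, and setting $\epsilon_l = 0$ is a simplifying assumption made only for illustration. In this noise-free case the iteration reproduces $b_l = k_j\,\frac{l+1}{2}\prod_{n<l}\alpha_n$, so for constant $\alpha$ the growth is exponential in $l$ up to a factor linear in $l$.

```python
def iterate_bl(k_j, alphas):
    """Noise-free iteration of Eq. (21) together with z_{l+1} = alpha_l z_l.

    k_j is the degree of node j (so b_1 = z_1 = k_j); alphas[l-1] is the
    branching factor alpha_l. Returns [b_1, b_2, ..., b_{len(alphas)+1}].
    """
    b = float(k_j)
    z = float(k_j)
    values = [b]
    for alpha in alphas:
        # Eq. (21) with epsilon_l = 0, then the shell-size update
        b, z = alpha * (b + z / 2.0), alpha * z
        values.append(b)
    return values
```

For example, `iterate_bl(4, [2, 2, 2])` gives 4, 12, 32, 80, which equals $4 \cdot \frac{l+1}{2} \cdot 2^{l-1}$ for $l = 1, \ldots, 4$.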



B. Distribution of l-betweenness centrality



Eq. (22) allows one to relate the statistics of fixed-$l$ betweenness
to the statistics of shell occupancies for networks
that are uncorrelated, or short-range correlated.
Since the noise term (obtained from per-node occupancy
deviations on a shell) is independent of the root's degree
in this case, the distribution of fixed-$l$ betweenness can
be expressed as:

$$\rho_l(b) = \langle \delta(b_l(j) - b) \rangle = \int_{-\infty}^{\infty} d\xi \int_{1}^{N-1} dk\ \delta\!\left(\beta_l k e^{\xi} - b\right) P(k)\,\Phi_l(\xi), \tag{26}$$

where $\delta(x)$ is the Dirac delta function, $P(k)$ is the degree
distribution and $\Phi_l(\xi)$ is the distribution for the
noise $\xi_l(j)$, peaked at $\xi = 0$, with fast decaying tails
and $\Phi_1(x) = \delta(x)$. Performing the integral over the noise
$\xi$, one obtains the distribution for $l$-BC in the form of a
convolution:

$$\rho_l(b) = \frac{1}{b} \int_{1}^{N-1} dk\ P(k)\,\Phi_l(\ln b - \ln \beta_l - \ln k). \tag{27}$$

From (27) it follows that the natural scaling variable for the
betweenness distribution is $u = \ln b - \ln \beta_l$. The noise
distribution (for $l > 1$) may introduce an extra $l$-dependence
through its width $\sigma_l$, which can be accounted
for via the rescaling $u \to u/\sigma_l$, $\rho_l \to \rho_l \sigma_l$, thus collapsing
the distributions for different $l$-values onto the
same functional form, directly supporting our numerical
observations presented in Ref. [35]. As $\Phi_l$ is typically
sharply peaked around 0, the most significant contribution
to the integral (27) for a given $b$ comes from degrees
$k \simeq b/\beta_l$. Since $k \ge 1$, we have a rapid decay of $\rho_l(b)$ in
the range $b < \beta_l$, a maximum at $b = \beta_l \bar{k}$, where $\bar{k}$ is the
degree at which $P(k)$ is maximum, and a sharp decay for
$b > (N-1)\beta_l$.



C. Estimating the average node-to-node distance 
in large networks. 

The scaling law on its own does not provide information
about the typical largest node-to-node distance,
which is always a manifestation of the finiteness of the
graph. However, knowing the size of the network in terms
of the number of nodes $N$, one can exploit our formulas
to find the average largest node-to-node distance as the
radius $L^*$ of the typical largest shell beyond which finite-size
effects become strong, that is, where network edge
effects appear. This can be estimated as the point where
the sum of the average shell sizes reaches $N$. Hence:

$$\sum_{l=1}^{L^*} \langle z_l \rangle = N, \tag{28}$$

providing an implicit equation for $L^*$. The $\beta_l$-s are determined
numerically for $l = 1, 2, 3, \ldots$ and a corresponding
functional form fitting its scaling with $l$ can be extrapolated
for larger $l$ values up to $L^*$, when the sum
in (28) hits $N$. For our social network data one obtains
$L^* \simeq 9.35$ (Fig. 2). Here $L^*$ is not necessarily an integer,
because it is obtained from the scaling behavior of
the average shell sizes, and represents the typical radius
of the largest shell. Expression (28) can be easily specialized
for the two classes of networks discussed above,
namely, for those having exponential average shell-size
growth $\langle z_l \rangle \simeq \langle k \rangle \alpha^{l-1}$ and for those having a power-law
average shell-size growth $\langle z_l \rangle \simeq \langle k \rangle l^{d-1}$. For the exponential
growth case we obtain:

$$L^* = \frac{1}{\ln \alpha} \ln\!\left[ 1 + \frac{\alpha - 1}{\langle k \rangle}\,N \right], \tag{29}$$

resulting in the $L^* \sim \ln N$ behavior for large $N$.

For the power-law growth case there is no easily invertible
expression for the sum; however, if we replace the
summation with an integral, we find the approximate
expression

$$L^* \simeq \left[ 1 + \frac{d}{\langle k \rangle}\,N \right]^{1/d}, \tag{30}$$

with the expected asymptotic behavior $L^* \sim N^{1/d}$ as $N \to \infty$.

FIG. 2: In the SocNet the sum, $Z_l$, of the average shell sizes
grows exponentially as function of $l$. Extrapolating, we can
predict that it reaches the $N = 5,568,785$ mark at $L^* \simeq 9.35$.
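
The estimate of $L^*$ can be tried out numerically. The sketch below assumes the two model growth laws $\langle z_l \rangle = \langle k \rangle \alpha^{l-1}$ and $\langle z_l \rangle = \langle k \rangle l^{d-1}$: summing the first from $l = 1$ to $L^*$ and integrating the second from 1 to $L^*$ gives the two closed forms coded here. The function names and the linear interpolation inside the last shell in `lstar_from_shells` are our own illustrative choices, not part of the original method.

```python
import math

def lstar_exponential(N, k_avg, alpha):
    """Solve k_avg * (alpha**L - 1) / (alpha - 1) = N for L."""
    return math.log(1.0 + (alpha - 1.0) * N / k_avg) / math.log(alpha)

def lstar_power_law(N, k_avg, d):
    """Solve k_avg * (L**d - 1) / d = N for L (sum replaced by an integral)."""
    return (1.0 + d * N / k_avg) ** (1.0 / d)

def lstar_from_shells(N, shell_sizes):
    """Radius at which the cumulative average shell size reaches N.

    shell_sizes[l-1] is <z_l>. The position inside the last shell is
    linearly interpolated (an assumption); returns None if N is not reached.
    """
    total = 0.0
    for l, z in enumerate(shell_sizes, start=1):
        if total + z >= N:
            return l - 1 + (N - total) / z
        total += z
    return None
```

With $\langle k \rangle = 2$ and $\alpha = 2$ the shells 2, 4, 8, 16, 32 add up to 62 nodes, and both `lstar_exponential(62, 2, 2)` and `lstar_from_shells(62, [2, 4, 8, 16, 32])` return a radius of 5 (up to rounding).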



D. Algorithm complexity 

We are now in position to estimate the average-case
complexity of the range-limited centrality algorithm. For
every root i, we sequentially build its l = 1, 2, ..., L shells.
When going from shell $G_{l-1}(i)$ to building shell $G_l(i)$, we
consider all the $z_{l-1}$ nodes on $G_{l-1}(i)$. For every such
node j we add to $G_l(i)$ all its links that do not connect to
already tagged nodes (a tag labels a node that belongs to
$G_{l-1}(i)$ or $G_{l-2}(i)$), and add the corresponding nodes as
well. This requires on the order of $\langle k \rangle$ operations for
every node j, hence on the order of $\langle k \rangle \langle z_{l-1} \rangle$ operations
for creating shell $G_l(i)$. Next is the update of the
shortest-path counts, which involves $\langle e_l \rangle$ steps, where $e_l$
is the number of edges connecting nodes in shell $G_{l-1}(i)$
to nodes in shell $G_l(i)$. Initializing the l-BC of the nodes
in the new shell involves $\langle z_l \rangle$ steps. Eqs. (8) and (9)
generate a total of $2 \sum_{m=1}^{l} \langle e_m \rangle$ operations. Hence, for
a given l there are a total of
$\langle k \rangle \langle z_{l-1} \rangle + \langle e_l \rangle + \langle z_l \rangle + 2 \sum_{m=1}^{l} \langle e_m \rangle$ operations on
average. Thus the average complexity of the algorithm C
can be estimated as:



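$$C \sim N \sum_{l=1}^{L} \left[ \langle k \rangle \langle z_{l-1} \rangle + \langle e_l \rangle + \langle z_l \rangle + 2 \sum_{m=1}^{l} \langle e_m \rangle \right] \qquad (31)$$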



Note that the edges in the shells $G_{m-1}(i)$ and
$G_m(i)$ are all fanning out from nodes in $G_{m-1}(i)$, and
thus we can approximate $\langle e_{m-1} \rangle + \langle e_m \rangle$ with $\langle k \rangle \langle z_{m-1} \rangle$.
Thus, the estimate becomes:



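$$C \sim N \langle k \rangle \sum_{l=1}^{L} \sum_{m=1}^{l} \langle z_m \rangle = N \langle k \rangle \sum_{l=1}^{L} (L - l + 1) \langle z_l \rangle \qquad (32)$$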



From (32) it follows that 



$$N \langle k \rangle \sum_{l=1}^{L} \langle z_l \rangle < C < L N \langle k \rangle \sum_{l=1}^{L} \langle z_l \rangle \qquad (33)$$



For fixed L, the complexity grows linearly with N as $N \to \infty$.
For $L = L^*$ we can use (28) to conclude that

$$O(NM) < C < O(L^* N M) \qquad (34)$$

where $M = N \langle k \rangle / 2$ denotes the total number of edges
in the network. Recall that the Brandes or Newman algorithm
has a complexity of O(NM) for obtaining the
traditional betweenness centralities. Specializing the
expression (32) to networks with exponentially growing
shells one finds the same O(NM) complexity (that is, the
upper bound $O(NM \ln N)$ in (34) is not realized); for
networks with power-law growth shells, however, we find
$O(N^{1+1/d} M)$, as in the upper bound of (34). The extra
computational cost is due to the fact that instead of a single
value, our algorithm produces a set of L numbers (the
l-BCs), providing multiscale information on betweenness
centrality for all nodes and all edges in the network.
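The operation-count estimate and its bounds can be evaluated for any given sequence of average shell sizes. The following sketch assumes the leading-order form in which the count is N⟨k⟩ times the nested sum of shell sizes; the function name and the sample shell sizes are hypothetical.

```python
def complexity_estimate(n_nodes, k_avg, shell_sizes):
    """Return (lower, estimate, upper) operation counts.

    shell_sizes: average shell sizes <z_1>, ..., <z_L>.
    estimate = N <k> sum_{l=1..L} sum_{m=1..l} <z_m>
             = N <k> sum_{l=1..L} (L - l + 1) <z_l>
    lower    = N <k> sum_l <z_l>
    upper    = L * lower
    """
    big_l = len(shell_sizes)
    # Each shell l is revisited once for every later range, i.e. (L - l + 1) times.
    weighted = sum((big_l - l + 1) * z for l, z in enumerate(shell_sizes, start=1))
    total = sum(shell_sizes)
    prefactor = n_nodes * k_avg
    return prefactor * total, prefactor * weighted, big_l * prefactor * total


# Hypothetical shell sizes: geometric growth versus quadratic growth.
print(complexity_estimate(1000, 4, [4, 8, 16, 32]))
print(complexity_estimate(1000, 4, [4, 16, 36, 64]))
```

For geometric growth the estimate stays within a constant factor of the lower bound, whereas for slower (power-law) growth it moves towards the upper bound, in line with the discussion above.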



E. Freezing of ranking by range-limited 
betweenness 



In Ref. [35] we have provided numerical evidence that
the ranking of the nodes (the same holds for edges) by their
[L]-BC values freezes at relatively small values of L. Here
we show how this freezing phenomenon emerges. Consider
two arbitrary nodes i and j, with degrees $k_i$ and $k_j$.
Using Eq. (22) we can write



In 



Based on ([24| 



In^ 



l-l 



i + i 



i + i 



-Xn 



(35) 



(36) 



where $X_n = \epsilon_n(j) - \epsilon_n(i)$. By definition, $\epsilon_n(j)$ is
the per node variation of shell-occupancy from its root-
independent value, for the n-th shell centered on root
node j. Expectedly, for larger shells (larger n), the size
of the shells becomes less dependent on the local graph
structure surrounding the root node, and for this reason
this noise term has a magnitude $|\epsilon_n(j)|$ that decays with
n. Thus, the $X_n$ can be considered as random variables
centered around zero, with a magnitude that is decaying
with increasing n. The contribution of the noise
terms coming from larger radius shells in the sum (36)
is decreasing not only because the corresponding $X_n$-s
are decreasing in absolute value, but also because their
weight in the sum is decreasing (as 1/(l + 1)), and therefore
when moving from l to l + 1 in (36) the change (the
fluctuation) in its value decreases for larger l. This effectively
means that the rhs of (35) saturates, and thus, accordingly,
the lhs saturates as well, freezing the ordering of
betweenness values. If the two nodes have largely different
degrees ($\ln(k_j/k_i)$ is relatively large), the noise term
will not be able to change the sign on the rhs of (35),
even for small l values, and thus, the ordering between
nodes with very different degrees will freeze the fastest,
followed by nodes with degrees that are close to each
other. Clearly, the freezing of ordering between nodes
with identical degrees ($k_j = k_i$) will happen last. The
probability for the ordering to flip when increasing the
range from l to l + 1 can be calculated for specific network
models; however, it will not be discussed here.



IV. RANGE-LIMITED CENTRALITIES IN A 
LARGE-SCALE SOCIAL NETWORK 

In this section we illustrate the power of the range-
limited approach on a real-world social network inferred
from cell-phone call logs (SocNet). We show that computing
the [L]-BCs up to a relatively small limit length
can already be used to predict the full, diameter-based
betweenness centralities of individual nodes (and edges),
their distribution, and the top list of nodes with highest
centralities.

This social network was constructed from 708 million 
anonymized phone-calls between 7.2 million callers gen- 
erated in a period of 65 days. Restricting ourselves to 
pairs of individuals between which phone-calls have been 
observed in both directions in this period as a definition 
of an edge, we found that the giant component of this net- 
work has about 5.5 million nodes and 27 million edges. 
The 65 days is long enough to guarantee that individuals 
with strong social bonds have called each other at least 
once during this interval, and therefore will be linked by 
an edge in our graph. 

To test and validate our predictions using the range- 
limited method, we actually performed the computation 
of the full, diameter-based betweenness centralities of all 
the nodes in SocNet. To deploy the computation, we 
used a distributed computing utility called Work Queue, 
developed in the Cooperative Computing Lab at Notre 
Dame. The utility consists of a single management server 
that sends tasks out to a collection of heterogeneous 
workers/processors. Specifically, our workers consisted 
of 250 Sun Grid Engine cores, 300 Condor cores, and 12 
local workstation cores, for a total of 562 cores. This al- 
lowed us to finish thousands of days of computation in 
the course of 5 days. Each worker received a request to 
compute the contribution of shortest paths starting from 
50 vertices to the betweenness centrality of every vertex 
in the network, summed the 50 results, and sent them 
back to the management server. Each time the man- 
agement server received a contribution, it summed the 
contribution with all the others and provided another 50 
vertices for the worker. 
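The batching scheme described above can be sketched as follows. This is only an illustration and does not use the Work Queue API: a sequential loop stands in for the remote dispatch, the per-source computation is the standard Brandes dependency accumulation for unweighted graphs (endpoints excluded, which is an assumption about the convention), and all function names are hypothetical.

```python
from collections import deque


def single_source_dependencies(adj, s):
    """Contribution of shortest paths starting at s to the BC of every node."""
    dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
    queue = deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj.get(u, ()):
            if v not in dist:                 # first time v is reached
                dist[v], sigma[v], preds[v] = dist[u] + 1, 0, []
                queue.append(v)
            if dist[v] == dist[u] + 1:        # edge u -> v lies on a shortest path
                sigma[v] += sigma[u]
                preds[v].append(u)
    delta = {v: 0.0 for v in order}
    for w in reversed(order):                 # farthest nodes first
        for u in preds[w]:
            delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
    delta[s] = 0.0                            # the source itself is not counted
    return delta


def worker_task(adj, batch):
    """What one worker returns: the summed contributions of a batch of sources."""
    partial = {}
    for s in batch:
        for v, value in single_source_dependencies(adj, s).items():
            partial[v] = partial.get(v, 0.0) + value
    return partial


def manager(adj, batch_size=50):
    """Hand out batches of source vertices and sum the returned contributions."""
    nodes = list(adj)
    total = {v: 0.0 for v in nodes}
    for start in range(0, len(nodes), batch_size):
        partial = worker_task(adj, nodes[start:start + batch_size])
        for v, value in partial.items():
            total[v] += value
    return total


# Path graph a - b - c stored as a symmetric directed adjacency list.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(manager(path, batch_size=2))
```

Because the per-source contributions simply add up, the batches can be processed in any order and on any worker, which is what makes the computation easy to distribute.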

We also determined the network diameter from the
data using a similar distributed computing method, obtaining
D = 26. At first sight this value seems to be
at odds with the famous six-degrees of separation phenomenon,
which implies a much smaller diameter. However,
there are two observations that one can make here.
1) The social network has a dense core with protruding
branches ("tentacles"), which, mathematically speaking,
can generate a large diameter. However, the experimentally
determined six degrees of separation does not probe
all the branches; it actually relies on the denser core for
information flow. Hence it should be closer to
the average node-to-node distance than to the rigorously
defined network diameter. Indeed, the value of
$L^* = 9.35$ that we obtained is rather close to the six-
degrees observation. 2) The social network constructed
based on cell-phone communications gives only a sample
subgraph of the true social network, where communications
happen also face-to-face and through land-line
phone calls. Hence, one would likely measure an even
smaller $L^*$ were such data available.



A. Predicting betweenness centralities of
individual nodes

In large networks, where measuring the full betweenness
centralities (i.e., based on all-pair shortest paths)
is too costly, we can use the scaling behavior of range-
limited BC values to obtain an estimate for the full BC
value of a given node. Plotting the [l]-BC values measured
up to a limit L as function of l, we can extrapolate
to ranges beyond L. In any finite network the [l]-BC
values will saturate, and thus we expect the appearance
of finite size effects for large enough l, that is in
the range $L^* < l < D$, where $L^*$ is the typical radius
of the largest shell and can be estimated as described
in subsection III C. In Fig. 3 we plot the [l]-BC values
$B_l(i)$ for $l \le L = 5$ for four nodes of SocNet. The
four nodes were chosen to have very different $B_l$ values.
Ranking the nodes by their [l = 5]-BC values, node i
ranked the highest, and nodes j, k and m ranked 100,
1000 and 10000, respectively. The horizontal dashed lines
represent the full BC values of the nodes obtained from
the exact, diameter-length based measurements (as described
above). Fitting the five values and extrapolating
the range-limited BCs, we can see that for nodes i, j, and
k, the curves reach their corresponding full BC at around
$l \simeq 9.5$, agreeing well with the typical length $L^* \simeq 9.35$
estimated in subsection III C. For low ranking nodes (small
full BC) finite size effects should appear at lengths larger
than $L^*$, because they are situated towards the periphery
of the graph. Indeed, one can see from Fig. 3 that node
m reaches its full BC at $l \simeq 10.3$, still fairly close to the
estimated $L^*$.

Thus, once we determined $L^*$ as described in III C,
then by simply extrapolating the fitting curve to the [l]-
BCs of a given node up to $l = L^*$, we obtain an
estimate/lower bound for its full betweenness centrality.

FIG. 3: The [l]-BC values, $B_l$, of 4 individual nodes in the
SocNet as function of l. Range-limited measurements were
made for l = 1, 2, 3, 4, 5; the exact BC value of each node
is indicated by a horizontal dashed line. Extrapolating the
range-limited values for larger l, the real BC value is reached
at around 9.5 for nodes i, j, k and 10.3 for node m.



B. Predicting BC distributions

In SocNet the $B_l$ values have a lognormal distribution
[35], thus $Q_l(\ln(B_l))$ can be well fitted by a Gaussian
(Fig. 4a). The parameters of the distribution also show
a scaling behavior, and extrapolating up to $L^* = 9.35$
we obtain $\mu^* = 17.28$ for the average (Fig. 4b) and
$\sigma^* = 2.25$ (Fig. 4c) for the standard deviation of the Gaussian.
This predicted distribution is shown as a dashed line on
Fig. 4a. Comparing it with the distribution of the full
BC values (l = D) we can see that while the averages
agree, the width of the distribution is, however, smaller
than the predicted value. This is caused by the fact
that the [l]-BCs do not saturate at the same l value: for
low centrality nodes saturation occurs at larger l, as also
shown in Fig. 3.
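Both the single-node estimates and the predicted Gaussian parameters rest on the same operation: fit the values measured for small l and evaluate the fit at $L^*$. The sketch below illustrates this with an ordinary least-squares line through the logarithm of the measured values. The log-linear form is an assumption made here for illustration only (the fitting form actually used is not restated in this section), and the sample values are synthetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_log_linear(ranges, values, l_star):
    """Fit ln(value) against the range l and evaluate the fit at l_star."""
    slope, intercept = fit_line(ranges, [math.log(v) for v in values])
    return math.exp(intercept + slope * l_star)


# Synthetic range-limited values for l = 1..5 (not SocNet data).
ranges = [1, 2, 3, 4, 5]
values = [3.0 * 2.0 ** l for l in ranges]
print(extrapolate_log_linear(ranges, values, l_star=9.35))
```

For real data the residuals of the fit should be inspected before trusting the extrapolated value, since saturation sets in at different l for different nodes.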



C. Predicting BC ranking 

Efficiently identifying high betweenness centrality
nodes and edges is rather important in many applications,
as these nodes and edges both handle large
amounts of traffic (thus they can be bottlenecks or congestion
hotspots), and form high-vulnerability subsets
(their removal may lead to major failures). Fortunately,
due to the freezing phenomenon described in subsection
III E, one does not need to compute accurately the full
BCs in order to identify the top ranking nodes and edges.
At already modest l values we obtain top-lists that have
a strong overlap with the ultimate, [l = D]-BC top-list.
Here we illustrate this for the case of SocNet. Table I
lists the [l]-BC (for l = 1, 2, 3, 4, 5 and l = D = 26) of the
top 10 nodes from the [D]-BC list in SocNet. The overlap
between the top lists at consecutive l values increases
with l. Given two lists, we define the overlap between
their first (top-ranking) r elements by the percentage of
common elements in both r-element lists. Table II shows
the overlap between the top list based on [5]-BC and the
one based on the ultimate [D]-BC values. At l = 5 the
top 4 nodes are already exactly in the same order as in
the [D]-BC list, the overlap is 90% between the lists of
the top 10 nodes, and even for the top 100 node lists we
have an overlap of 75%.

TABLE I: $B_l$ values of the top 10 nodes in the [D]-BC top-list for SocNet, for l = 1, 2, 3, 4, 5, D, where D = 26 is the diameter.

      r      overlap (%)
      1      100
      2      100
      3      100
      4      100
     10       90
     50       72
    100       75
    500       70.2
   1000       67.1

TABLE II: Overlap between the lists of the top r nodes with
highest [5]-BC and with the highest [D]-BC values.

FIG. 4: a) Distribution $Q_l$ of the $\ln(B_l)$ values in the SocNet
for l = 1, 2, 3, 4, 5, D, where D = 26 is the diameter, and the
predicted distribution for $L^*$. The distributions can be fitted
with a Gaussian. b) The average $\mu$ and c) standard deviation
$\sigma$ as function of l. Extrapolating to $L^* = 9.35$ we obtain
$\mu^* = 17.28$ and $\sigma^* = 2.25$.
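The overlap measure defined in this subsection is straightforward to compute from two rankings. The helper below is a minimal sketch; the function name and the example rankings are hypothetical.

```python
def top_r_overlap(ranking_a, ranking_b, r):
    """Percentage of common elements among the first r entries of two rankings.

    Each ranking is a sequence of node identifiers sorted from highest to
    lowest centrality. Order inside the top-r lists is ignored.
    """
    if r <= 0 or r > min(len(ranking_a), len(ranking_b)):
        raise ValueError("r must be between 1 and the length of the shorter ranking")
    common = set(ranking_a[:r]) & set(ranking_b[:r])
    return 100.0 * len(common) / r


# Made-up rankings by a range-limited BC and by the full BC.
by_range_limited = ["n1", "n2", "n3", "n5", "n4"]
by_full = ["n1", "n2", "n4", "n3", "n6"]
print(top_r_overlap(by_range_limited, by_full, 2))
print(top_r_overlap(by_range_limited, by_full, 5))
```

Note that an overlap of 100% for a given r does not by itself imply identical ordering; checking that requires comparing the two top-r lists element by element.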



V. RANGE-LIMITED CENTRALITY IN 
WEIGHTED GRAPHS 

In unweighted graphs the length of the shortest path
between two nodes is defined as the number of edges
included in the shortest path. In weighted networks each
edge has a weight or "length": $w_{ij}$. Depending on the
nature of the network this length can be an actual physical
distance (e.g., in road networks), or a cost or a resistance
value. We define the "shortest" (or lowest-weight) path
between nodes i and j as the network path along which
the sum of the weights of the edges included is minimal.
We call this sum the "shortest distance"
d(i,j) from node i to node j (note that we allow for
directed links, which implies that d(i,j) is not necessarily
the same as d(j,i)).

In order to define a range-limited quantity, let $b_l(j)$
denote the (fixed) l-BC of node j from all-pair shortest
directed paths of length $W_{l-1} < d \le W_l$, where
$W_1 < W_2 < \cdots < W_L$ are a series of predefined weight
values or "distances". The simplest way to define these
$W_l$ distances is to take them uniformly, $W_l = l \Delta w$; however,
depending on the application these may be redefined
in any suitable way. $B_L$ will again denote the cumulative
L-betweenness, which represents centralities from paths
not longer than $W_L$. Note that we are still counting
paths when computing centralities, that is, $\sigma_{mn}(i)$ still
means the number of shortest paths from m to n passing
through i, except for the meaning of "shortest", which is
now generalized to lowest-cost.

The algorithm is similar to the one presented above for
unweighted networks. We again build the subgraph of a
node i, but now a shell $G_l(i)$ will contain all the nodes
k at shortest path distance $W_{l-1} < d(i,k) \le W_l$ from
the root node i. An edge $j \to k$ is considered to be part
of the layer in which node k is included. In unweighted
graphs a connection $j \to k$ can be part of the subgraph
only if the two nodes are in two consecutive layers: if
$j \in G_r(i)$ then $k \in G_{r+1}(i)$. In weighted networks the
situation is different (Fig. 5a). In principle we may have
edges connecting nodes which are not in two consecutive
layers, but possibly further away from each other (the
links $i \to j \to o$ in Fig. 5a), or even in the same layer
(the link $m \to n$ in the same figure).

When building the subgraph using breadth-first search,
we need to save the exact order in which the nodes
and edges are discovered and included in the subgraph
(Fig. 5b,c). Let us denote with v(p) the index of the
node which is included at position p in this node's list
(Fig. 5b). This means that the following conditions hold:
$d(i,v(1)) \le d(i,v(2)) \le d(i,v(3)) \le \ldots$. Similarly we
have a list of edges, where $q_x(p) \to q_y(p)$ is the edge in
position p of the list, and $q_x$, $q_y$ denote the indexes of the
two nodes connected by the edge (Fig. 5c). This implies
the conditions: $d(i,q_y(1)) \le d(i,q_y(2)) \le d(i,q_y(3)) \le \ldots$
(note that every edge $q_x(p) \to q_y(p)$ is included in
the edge-list when node $q_y(p)$ is discovered). Again, we
calculate $b_l^r(i|k)$ for a node k, and $b_l^r(i|j,k)$ for an edge
$j \to k$. As defined above, these values take into account
only the shortest paths starting from node i, and
r denotes the shell containing the corresponding node or
edge. One uses the same initial conditions $\sigma_{ii} = 1$, and
$\sigma_{ik} = 0$ for all $k \ne i$, as before.
The algorithm has the following main steps. For every
l = 1, ..., L:

1) We build the next layer $G_l(i)$ using breadth-first
search. During this search we build the lists of indexes
$q_x$, $q_y$ as defined above. We denote the total number
of nodes included in the list (from all shells $G_1(i)$ up to
$G_l(i)$) as $N_l$ and the number of edges included as $M_l$.
During this breadth-first search we also calculate the $\sigma_{ik}$
of the discovered nodes. Every time a new edge $j \to k$ is
added to the list we update $\sigma_{ik}$ by adding to it $\sigma_{ij}$ (using
algorithmic notation, $\sigma_{ik} := \sigma_{ik} + \sigma_{ij}$). Recall that $\sigma_{ik}$
denotes the total number of shortest paths from i to k. If
the edge $j \to k$ is included in the subgraph (meaning that
it is part of a shortest path) the number of shortest paths
ending in j has to be added to the number of shortest
paths ending in k.

2) The l-betweenness of all nodes included in the new
layer is set to $b_l^l(i|k) = 1$, as in the unweighted case.

3) Going backwards through the list of edges we calculate
the fixed-l BC of all nodes and edges. For
$p = M_l, \ldots, 1$, we perform the following recursions:

a) for the edge $q_x(p) \to q_y(p)$:

$$b_l^r(i|q_x(p), q_y(p)) = b_l^r(i|q_y(p)) \, \frac{\sigma_{i q_x(p)}}{\sigma_{i q_y(p)}} \qquad (37)$$

b) immediately after the BC of an edge is calculated, the
betweenness of node $q_x(p)$ must also be updated. We
have to add to its previous value the l-BC of the edge
$q_x(p) \to q_y(p)$:

$$b_l^r(i|q_x(p)) := b_l^r(i|q_x(p)) + b_l^r(i|q_x(p), q_y(p)) \qquad (38)$$

4) We return to step 1) until the last shell $G_L(i)$ is
reached.

FIG. 5: a) Shells of the $C_3$ subgraph of node i (black) are
colored red, blue, green. Distances defining the shells are:
$W_1 = 1$, $W_2 = 2$, $W_3 = 3$. The weight or length is shown
next to each edge. Given a node j, the number inside its
circle is the total number of shortest paths coming from the
root i: $\sigma_{ij}$. b) The list of nodes v(p) and c) list of edges
$q_x(p) \to q_y(p)$ are shown together with their 1-, 2-, and 3-
betweenness values.

As we have seen, the algorithm and the recursions are
very similar to the ones presented for unweighted graphs.
The crucial difference is that the exact order of the
discovered nodes and edges has to be saved, because the BC
values of edges and nodes in a shell cannot be updated in
an arbitrary order. As an example, Fig. 5 shows a small
subgraph and the list of nodes and edges together with
their 1-, 2- and 3-betweenness values.
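The steps of this section can be sketched for a single root as follows. This is a simplified illustration, not the authors' implementation: it assumes strictly positive weights, runs one Dijkstra-type search up to the largest range instead of growing the shells one at a time, repeats the backward pass for every l, and compares distances for exact equality when detecting ties. Endpoint handling follows the steps as written (nodes of the new shell start at 1, and the root accumulates the edge values). All names are hypothetical.

```python
import heapq
from bisect import bisect_left
from itertools import count


def weighted_range_limited_bc(adj, root, bounds):
    """Fixed-l betweenness from shortest paths that start at `root`.

    adj    : dict node -> list of (neighbor, weight), directed, weight > 0
    bounds : increasing list [W_1, ..., W_L] of shell boundaries
    Returns (node_bc, edge_bc); entry l-1 of each list is a dict with the l-BC.
    """
    dist, sigma, preds = {root: 0.0}, {root: 1.0}, {root: []}
    order, settled, tie = [], set(), count()
    heap = [(0.0, next(tie), root)]
    while heap:
        d, _, u = heapq.heappop(heap)
        if u in settled:
            continue                          # stale heap entry
        settled.add(u)
        order.append(u)                       # discovery order, the list v(p)
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd > bounds[-1]:
                continue                      # beyond the last shell
            if v not in dist or nd < dist[v]:
                dist[v], sigma[v], preds[v] = nd, sigma[u], [u]
                heapq.heappush(heap, (nd, next(tie), v))
            elif nd == dist[v]:               # another shortest path into v
                sigma[v] += sigma[u]
                preds[v].append(u)

    def shell(k):
        # Shell l holds nodes with W_{l-1} < d <= W_l; the root is shell 0.
        return 0 if k == root else bisect_left(bounds, dist[k]) + 1

    # Edge list q_x(p) -> q_y(p), ordered by the discovery of q_y.
    edges = [(j, k) for k in order for j in preds[k]]
    node_bc, edge_bc = [], []
    for l in range(1, len(bounds) + 1):
        b = {k: (1.0 if shell(k) == l else 0.0) for k in order if shell(k) <= l}
        e = {}
        for j, k in reversed(edges):          # backward pass over the edge list
            if shell(k) > l:
                continue
            e[(j, k)] = b[k] * sigma[j] / sigma[k]   # edge recursion
            b[j] += e[(j, k)]                        # node update
        node_bc.append(b)
        edge_bc.append(e)
    return node_bc, edge_bc


# Small directed example: two equally short routes from i to b.
graph = {"i": [("a", 1), ("b", 2)], "a": [("b", 1)], "b": []}
nodes, links = weighted_range_limited_bc(graph, "i", [1, 2])
print(nodes[1], links[1])
```

Processing the edges in reverse discovery order guarantees that the value of $q_y(p)$ is final before it is passed on to $q_x(p)$, which is why the order has to be stored.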



VI. VULNERABILITY BACKBONE 

An important problem in network research is identifying
the most vulnerable parts of a network. Here we
define the vulnerability backbone (VB) of a graph as
the smallest fraction of the highest betweenness nodes
forming a percolating cluster through the network.
Removing simultaneously all elements of this backbone will
efficiently shatter the network into many disconnected
pieces. Although the shattering performance can be
improved by sequentially removing and recomputing the
top-ranking nodes [48], here we focus only on the
simultaneous removal of the one-time computed VB of a graph,
the generalization being straightforward.

FIG. 6: The vulnerability backbone VB of a random geometric
graph in the unit square with N = 5000, $\langle k \rangle = 5$
and D = 195. The top 30% of nodes are colored from red
to yellow according to their [l]-BC ranking (see color bar).
The VB based on the [l]-BC is shown for different values:
l = 1, 2, 5, 15, 45, 195.

Next we illustrate that range-limited BCs can be used
to efficiently detect this backbone by performing calculations
up to a length much smaller than the diameter.
This is of course expected in networks that have a small
diameter ($D = O(\ln N)$ or smaller); however, it is less
obvious for networks with large diameter ($D = O(N^{\alpha})$,
$\alpha > 0$). For this reason, in the following we consider
random geometric (RG) graphs [73, 74] in the plane. The
graphs are obtained by sprinkling at random N points
into the unit square and connecting all pairs of points
that are found within a given distance R of each other.
We will use the average degree $\langle k \rangle = N \pi R^2$ [74] instead
of R to parametrize the graphs. In Fig. 6 we
present measurements on a random geometric graph with
N = 5000 nodes and average degree $\langle k \rangle = 5$. The
hop-count diameter of this graph is D = 195. The weights
of connections are considered to be the physical
(Euclidean) distances. Clearly, since the links of the graph
are built based on a rule involving the Euclidean
distances, the weight structure and the topology of the
graph should be tightly correlated. Thus, we do expect
strong correlations between the [l]-BC values measured
both from the unweighted and the weighted graph. The
weight ranges $W_l$ defining the layers during the algorithm
were chosen as $W_l = 0.00725\,l$, l = 1, ..., D, so that
$W_D = 0.00725 D = 1.413$ is close to the diagonal length
of the unit square, $\sqrt{2}$. The nodes and connections are
colored according to their [l]-BC ranking for different
l values (see the color bar in Fig. 6). The backbone is
already clearly formed at l = 45. Fig. 7 compares the VBs
of the graphs obtained with and without considering the
connection weights (distances). Two RGs with densities
$\langle k \rangle = 5$ and $\langle k \rangle = 10$ are presented. In the case of the
denser graph the backbone is concentrated towards the
center of the unit square, as periphery effects in this case are
stronger (we do not use periodic boundary conditions).
Although qualitatively the two VBs are similar, the VB
is sharper and clearer in the weighted case. There can
be actually significant differences between the two
backbones, in spite of the fact that one would expect a strong
overlap. In Fig. 8 we show these differences by coloring
the nodes of the two graphs from Fig. 7 according to
the $\ln(r_{nw}/r_w)$ values, where $r_{nw}$ is the rank of a node
obtained using the non-weighted algorithm and $r_w$ is
obtained using the weighted graph. The nodes are colored
from blue to red, blue corresponding to the case when
the unweighted algorithm strongly underestimates the
weighted ranking of a node and red is used when it
overestimates it. Although it is of no surprise that weighted
and unweighted backbones differ in networks where the
graph topology and the weights are weakly correlated,
the fact that there are considerable differences also for
the strongly correlated case of random geometric graphs
(the blue and red colored parts in the right panel of Fig. 8)
is rather unexpected, underlining the importance of
using weight-based centrality measures in weighted
networks.

FIG. 7: Vulnerability backbones based on full BC rankings
in two random geometric graphs with N = 5000 nodes, and
average degrees $\langle k \rangle = 5$ and $\langle k \rangle = 10$, respectively. The
rankings were calculated both on the unweighted graph (left
column) and weighted one (right column).

FIG. 8: Comparison between the rankings obtained with and
without considering the weights of connections for the two
RG graphs in Fig. 7. Colors indicate the $\ln(r_{nw}/r_w)$ values,
where $r_{nw}$ is the rank of a node obtained using the
non-weighted algorithm and $r_w$ is obtained with the weighted
graph (see the color bar). In denser graphs the differences
become more significant.
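The random geometric graphs used in this section can be generated along the following lines. The sketch is an assumption-laden illustration rather than the code used for the figures: it draws points uniformly in the unit square without periodic boundaries, sets the radius from $\langle k \rangle = N \pi R^2$ (which ignores boundary effects, so the realized mean degree is somewhat lower), stores the Euclidean length of every link as its weight, and uses a grid of cells of side R only to avoid comparing all pairs. The function names and the seed are hypothetical.

```python
import math
import random


def random_geometric_graph(n_nodes, k_avg, seed=0):
    """Return (points, edges, radius); edges maps (i, j) with i < j to its length."""
    rng = random.Random(seed)
    radius = math.sqrt(k_avg / (n_nodes * math.pi))   # from <k> = N pi R^2
    points = [(rng.random(), rng.random()) for _ in range(n_nodes)]
    cells = {}
    for idx, (x, y) in enumerate(points):
        cells.setdefault((int(x / radius), int(y / radius)), []).append(idx)
    edges = {}
    for (cx, cy), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j:                     # each pair is stored once
                            length = math.dist(points[i], points[j])
                            if length <= radius:
                                edges[(i, j)] = length
    return points, edges, radius


def shell_bounds(step, n_shells):
    """Uniform weight ranges W_l = step * l for l = 1..n_shells."""
    return [step * l for l in range(1, n_shells + 1)]


points, edges, radius = random_geometric_graph(5000, 5.0, seed=1)
bounds = shell_bounds(0.00725, 195)
print(radius, len(edges), bounds[-1])
```

With the step 0.00725 and 195 shells the last boundary is 1.41375, just below the diagonal $\sqrt{2}$ of the unit square, which matches the choice of ranges described above.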



VII. CONCLUSIONS 

In this paper we have introduced a systematic ap- 
proach to network centrality measures decomposed by 
graph distances for both unweighted and weighted di- 
rected networks. There are several advantages to such 
range-based decompositions. First, they provide much 
finer grained information on the positioning importance 
of a node (or edge) with respect to the network, than the 
traditional (diameter-based) centrality measures. Tradi- 
tional centrality values are dominated by the large num- 



ber of long-distance network paths, even though most of 
these paths might not actually be used frequently by the 
transport processes occurring on the network. Due to 
the fast growth of the number of paths with distance in
large complex networks, one expects the distribution
of the centrality measures (which incorporate these
paths) to obey scaling laws as the range is increased.
We have shown both numerically and via analytic arguments
(identifying the scaling form) that this is indeed
the case for unweighted networks; for the same
reasons, however, we expect the existence of scaling laws
for weighted networks as well. We have shown that these
scaling laws can be used to predict or estimate efficiently
several quantities of interest that are otherwise costly
to compute on large networks, in particular the largest
typical node-to-node distance $L^*$, the traditional individual
node and edge centralities (diameter range), and
the ranking of nodes and edges by their centrality values.
The latter is made possible by the existence of the
phenomenon of fast freezing of the rank ordering by dis- 
tance, which we demonstrated both numerically and via 
analytic arguments. We have also introduced efficient al- 
gorithms for range-limited centrality measures for both 
unweighted and weighted networks. Although they have 
been presented for betweenness centrality, they can be 
modified to obtain all the other centrality measure vari- 
ants. Finally, we presented an application of these con- 
cepts in identifying the vulnerability backbone of a net- 
work, and have shown that it can be identified efficiently 
using range-limited betweenness centralities. We have
also illustrated the importance of taking into account
link weights [75] when computing centralities, even in
networks where graph topology and weights are strongly
correlated.